ABSTRACT

The National Assessment of Educational Progress 
(NAEP) recently expanded its national survey to include state 
representative samples on a trial basis. In 1992, a study of
non-response in the trial assessment was undertaken through two simulation
projects. The first looked at state assessments that were similar to 
NAEP tests for states with low NAEP response rates. The second 
project took states with 100 percent response rates and simulated 
levels and patterns of non-response of states with low NAEP response 
rates. In the first case, the simulation using figures from Illinois 
suggests that it might be useful to explore the use of substitution 
further for other states. Results suggest that one could simulate the 
patterns of non-response based on states with less than 100 percent 
response rates to try to develop a sense of when non-response becomes 
so large it has an impact on the estimated results. Two tables 
present analysis results. (SLD) 






NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS (NAEP): 

NONRESPONSE STUDY 



Douglas Wright & Michael P. Cohen, National Center for Education Statistics* 
Michael P. Cohen, 555 New Jersey Ave., NW, Washington, DC 20208-5654 






KEY WORDS: Bias, variance, simulation, 
schools, states 

Introduction

Recently, the National Assessment of
Educational Progress (NAEP) expanded from a
national survey to state representative samples on
a trial basis. Cooperation generally has been
quite good, with 38 states participating in the 8th
grade mathematics assessment in the first year
(1990) and 42 states in 1992.

School cooperation rates within states have
varied from state to state, from a low of 62% (in
1992) to 100%. The lower response rates have
raised concerns about potential nonresponse bias.
In 1992 NCES undertook to study the impact of
this nonresponse through two simulation projects.

The first project looked at state assessments
that were similar to NAEP tests for states with
low NAEP response rates. Often states will
conduct their own assessments on a census of
their schools. For such states we could calculate
the difference between the estimated average state
assessment score based on the NAEP sample
respondent schools and weights and that based on
the complete census. We could analyze these
differences both at the state level and for substate
categories.

The second project took states with 100% 
response rates and simulated the levels and 
patterns of nonresponse of states with low NAEP 
response rates. 

I. Background 

Before we explore the methodology for 
estimating nonresponse bias and other related 
issues, it is useful to understand the NAEP 
methodology for estimation. 

The basic survey design for a given state
involved the selection of approximately 100
schools with probability proportionate to size of
8th grade enrollment. The schools were
(implicitly) stratified based on urbanicity, percent
of minority enrollment, and household income.
The amount of stratification depended on the size
of the state school sampling frame relative to the
sample size. Larger states permitted greater
implicit stratification. Some schools were selected
with certainty.

If possible, given a school refusal to
cooperate, a substitute school with similar
characteristics was selected and assigned the
probability of the originally sampled school.
About 30 students were selected per sampled
school. Students were excluded from the
test-taking if they were incapable of taking the
assessment. Exclusion rates for states ranged
from 2 to 8 percent.

The school estimation weight was based on
two factors: the inverse of the probability of
selection and the school nonresponse adjustment
factor. In general, when schools within a state
did not respond and were not substituted for,
nonresponse classes were created based on
urbanicity, percent minority, and median
household income (the same variables that were
used for implicit stratification). Nonresponse
adjustment classes varied from state to state
depending on the distribution of these
characteristics, absolute sample size within a
potential nonresponse adjustment cell, and the size
of the nonresponse adjustment factor. If any
nonresponse class had fewer than six schools or a
ratio greater than or equal to 1.35, it was
collapsed (until the criteria were met).

A second set of weights was determined for
each sampled school to be used for variance
calculation. These weights were based on the
jackknife replication procedure. In the jackknife
procedure the impacts of nonresponse and
nonresponse adjustment are reflected in the
variance estimate.
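
The class-level adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NAEP production procedure: the adjustment factor is assumed to be the ratio of the total base weight of all sampled schools in a class to the total base weight of its respondents, the "fewer than six schools" rule is assumed to count sampled schools, and the school records, class labels, and function names are hypothetical.

```python
from collections import defaultdict

MIN_SCHOOLS = 6    # classes with fewer schools are collapsed (threshold from the text)
MAX_FACTOR = 1.35  # classes with a factor at or above this are collapsed

def class_summaries(schools):
    """Return {class: (adjustment_factor, n_schools)} for each nonresponse class.

    Assumed factor: total base weight of sampled schools in the class
    divided by the total base weight of the responding schools.
    """
    total_w = defaultdict(float)
    resp_w = defaultdict(float)
    n = defaultdict(int)
    for s in schools:
        total_w[s["cls"]] += s["base_weight"]
        n[s["cls"]] += 1
        if s["responded"]:
            resp_w[s["cls"]] += s["base_weight"]
    out = {}
    for c in total_w:
        factor = total_w[c] / resp_w[c] if resp_w[c] > 0 else float("inf")
        out[c] = (factor, n[c])
    return out

def needs_collapsing(factor, n_schools):
    """Apply the two collapsing criteria quoted in the text."""
    return n_schools < MIN_SCHOOLS or factor >= MAX_FACTOR

def make(cls, n, n_resp):
    """Build n hypothetical schools in class cls, the first n_resp responding."""
    return [{"cls": cls, "base_weight": 10.0, "responded": i < n_resp}
            for i in range(n)]

# Hypothetical classes: A passes both criteria, B is too small, C has too large a factor.
schools = make("A", 8, 6) + make("B", 4, 3) + make("C", 10, 7)
summaries = class_summaries(schools)
flags = {c: needs_collapsing(f, n) for c, (f, n) in summaries.items()}
for c in sorted(summaries):
    factor, n_schools = summaries[c]
    print(c, n_schools, round(factor, 3), "collapse" if flags[c] else "keep")
```

In practice a flagged class would be merged with a neighboring class and the check repeated until every class meets both criteria; the merging rule itself is not specified in the text and is left out here.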

II. Simulations Based on State Assessment Tests

Because of the timing, the contractor, 
Synectics, focused on the 1990 states, contacting 
a number of them with school-level nonresponse. 
Of these states, the contractor found only two that 
conducted state assessments on all their schools 
and that were comparable to NAEP. The states 
were California and Illinois. 

A. California 

California provided a list of 1577 schools with
addresses and school scores for its state
assessment. This file was matched to the NAEP
school sample of 104 schools. There were 6
nonrespondent schools and no substitute schools
for the California NAEP assessment. The
weighted correlation coefficient between the
NAEP scores and the California Achievement Test
(CAT) school scores was .92, based on the 98
responding schools. This indicates a fair amount
of comparability, implying that results of the
simulations based on the CAT scores would have
similar application to the NAEP scores.

Two estimates of bias were considered:

    bias_ru = M_ru - M* and
    bias_ra = M_ra - M*, where

M_ru is the estimate of the CAT score for the
respondent schools based on the inverse of the
probability of school selection, unadjusted for
nonresponse,

M_ra is the estimate of the CAT score for the
respondent schools based on the weight
adjustments for nonresponse (i.e., we use the
inverse of the probability of selecting the
school times the weight for the nonresponse
adjustment cell), and

M* is the true average CAT school score taken
over all schools in California.

To be a good reflection of NAEP bias, the
above school scores have been student weighted.
The average score using just the school weights
would probably be similar, but not identical.

We can test whether these estimates of bias
are significantly different from 0, using the
estimated variances of M_ru and M_ra. The
estimates of variance are based on the jackknife,
the method used to calculate variances for NAEP.
For California, the estimates and their estimated
standard errors (S) were as follows:

    M_ru = 270.10, S_ru = 4.50,
    M_ra = 269.65, S_ra = 2.96,
    M* = 272.94,

    bias_ru = -2.84, and
    bias_ra = -3.29.

In this instance the nonresponse adjusted
estimate of bias is larger in absolute value than
the unadjusted estimate. Based on the sampling
variance, neither
estimate of bias is significantly different from 0.

Another estimate of interest is the unbiased
estimate (of the CAT score) based on the
original (NAEP) sample of schools. This estimate
for California, 269.43, is consistent with the
thought that sampling error dominates school
nonresponse bias.
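
The arithmetic behind the California comparison can be checked with a short Python sketch. The estimates, standard errors, and true mean are the figures reported above; the significance rule is an assumption, since the text does not name the test used. Here each bias is divided by its jackknife standard error and compared with the two-sided 5% normal critical value of 1.96, treating the census mean as a fixed constant.

```python
CRITICAL_VALUE = 1.96  # assumed two-sided 5% normal critical value

def bias_and_ratio(estimate, std_error, true_mean):
    """Return (bias, bias / standard error) for one estimate."""
    bias = estimate - true_mean
    return bias, bias / std_error

TRUE_MEAN = 272.94  # census average CAT school score for California
california = {
    "unadjusted": (270.10, 4.50),  # base weights only
    "adjusted": (269.65, 2.96),    # with nonresponse adjustment
}

results = {label: bias_and_ratio(m, s, TRUE_MEAN)
           for label, (m, s) in california.items()}

for label, (bias, ratio) in results.items():
    significant = abs(ratio) > CRITICAL_VALUE
    print(f"{label}: bias={bias:.2f}, ratio={ratio:.2f}, significant={significant}")
```

Both ratios fall well inside the assumed critical value, which agrees with the statement that neither bias estimate differs significantly from 0.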

B. Illinois 

The NAEP sample size for Illinois was 105
schools, of which 23 schools were
nonrespondents. Of these, 19 schools were
ultimately substituted for, and 4 remained
nonrespondents. Illinois tests eighth grade
mathematics as a part of its Illinois Goals
Assessment Program (IGAP). The weighted
correlation coefficient between the NAEP scores
and the IGAP state school scores was .93, based
on the 101 responding schools. For Illinois we
also estimated the unweighted standard deviations
for the NAEP scores and the IGAP scores.

Because substitution was employed in Illinois,
it is possible to calculate four IGAP estimates in
addition to the universe estimate M*. The first
two are the same as before: M_ru, based on the
original sample cases and NAEP base weights,
and M_ra, based on the initial respondents
(excluding substitutes) and the appropriate NAEP
nonresponse adjustments. The third is M_su, based
on the original respondents plus substitutes, and
the fourth is M_sa, based on the original respondents
plus substitutes with the weights adjusted for
nonresponse.

With the new estimates we are able to
separate out the "nonresponse" bias, if any, into
two components: that due to nonresponse and
using the NAEP nonresponse weight adjustment
methodology, and that due to nonresponse plus
substitution and using the NAEP nonresponse
weight adjustments. The difference between the
two could be considered an estimate of the impact
of substitution.

The two new estimates give rise to two new
estimates of bias:

    bias_su = M_su - M* and
    bias_sa = M_sa - M*.

As before, these can be tested to see whether they
are different from 0.

The results for Illinois are as follows:

    M_ru = 243.75, S_ru = 6.01,
    M_ra = 248.59, S_ra = 4.79,
    M_su = 245.35, S_su = 5.87,
    M_sa = 244.88, S_sa = 4.19,
    M* = 248,

    bias_ru = -4.25,
    bias_ra = .59,
    bias_su = -2.65, and
    bias_sa = -3.12.

We can see that the estimate of bias with
substitutes and nonresponse adjustments is greater
in absolute value than the estimate based on no
substitution but with nonresponse adjustments.
Given the size of the biases and their
estimated standard errors, we cannot reject the
hypothesis that the biases are equal to 0.
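
The Illinois figures can be worked through in the same way. The sketch below uses the four reported IGAP estimates and their standard errors, computes each bias against the universe mean of 248, and takes the difference between the two nonresponse-adjusted biases (with and without substitutes) as a rough measure of the impact of substitution. The dictionary keys are descriptive labels chosen for this example, and the 1.96 threshold is again an assumed normal critical value rather than a test named in the text.

```python
CRITICAL_VALUE = 1.96  # assumed two-sided 5% normal critical value
TRUE_MEAN = 248.0      # universe (census) IGAP mean for Illinois

# (estimate, jackknife standard error) as reported in the text
illinois = {
    "resp_unadj": (243.75, 6.01),  # initial respondents, base weights
    "resp_adj": (248.59, 4.79),    # initial respondents, nonresponse adjusted
    "subs_unadj": (245.35, 5.87),  # respondents plus substitutes, base weights
    "subs_adj": (244.88, 4.19),    # respondents plus substitutes, adjusted
}

biases = {k: m - TRUE_MEAN for k, (m, _) in illinois.items()}
ratios = {k: biases[k] / s for k, (_, s) in illinois.items()}

# Difference between the two adjusted biases: an estimate of the
# impact of substitution, as suggested in the text.
substitution_impact = biases["subs_adj"] - biases["resp_adj"]

for k in illinois:
    print(f"{k}: bias={biases[k]:.2f}, ratio={ratios[k]:.2f}")
print(f"substitution impact: {substitution_impact:.2f}")
```

The difference of about -3.7 points is itself smaller than any of the reported standard errors, so it should be read as suggestive only.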

The unbiased estimate (of the IGAP score)
based on the original (NAEP) sample of schools
for Illinois is 245.31.

C. Conclusion 

While the above are only estimates of bias,
and ones that are not statistically significant given
the small sample sizes, further research on the
effects of school nonresponse at the state level
would be useful, especially with respect to states
with much higher rates of nonresponse than
California or Illinois. In addition, the estimate for
Illinois prior to substitution (but adjusting for
nonresponse) has a smaller estimated bias than
after substitution, indicating that it might be useful
to explore further the use of substitution.

III. Simulation Study of Nonresponse Using
States with 100% School Response

Based on the 1990 NAEP state response rates
it was decided to simulate nonresponse at three
levels: 5%, 10%, and 20%. Three states with
100% response were used as the simulation states:
Georgia, Colorado, and Connecticut. Because the
school response rate was 100%, the published
estimates for these states are unbiased estimates
(unbiased, at least, with respect to school nonresponse).

In order to simulate nonresponse bias, it would be
necessary to take a nonrandom sample of schools,
eliminate them from the file, and make estimates
based on the NAEP estimation system. It would,
therefore, not be sufficient to eliminate a random
sample of schools in the state, nor would it be
sufficient to eliminate a random sample of schools
within a state within cells based on the
nonresponse variables, because both of these would
result in unbiased estimates that would converge
to the sample estimate as the number of
simulations is increased.
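
The point about random versus nonrandom elimination can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the school scores are synthetic (drawn from a normal distribution purely for illustration), the estimate is an unweighted mean rather than the weighted NAEP estimator, and no nonresponse adjustment classes are formed. The function name and parameters are hypothetical.

```python
import random
from statistics import mean

def simulate_random_nonresponse(scores, rate, n_sims, seed=0):
    """Drop a random fraction `rate` of schools n_sims times.

    Returns the list of means of the remaining (responding) schools.
    """
    rng = random.Random(seed)
    n_keep = round(len(scores) * (1 - rate))
    return [mean(rng.sample(scores, n_keep)) for _ in range(n_sims)]

# Synthetic school scores, for illustration only.
data_rng = random.Random(1)
scores = [data_rng.gauss(258, 10) for _ in range(100)]
full_sample_mean = mean(scores)

# Random elimination: the average over many simulations approaches
# the full-sample estimate, so no bias is revealed.
sim_means = simulate_random_nonresponse(scores, rate=0.20, n_sims=2000)
average_of_sims = mean(sim_means)

# Nonrandom elimination: dropping the 20 lowest-scoring schools
# shifts the estimate upward, which is what a bias study needs.
nonrandom_mean = mean(sorted(scores)[20:])

print(round(full_sample_mean, 2), round(average_of_sims, 2), round(nonrandom_mean, 2))
```

Dropping the lowest-scoring schools is an extreme pattern chosen to make the contrast visible; realistic nonresponse would depend on school characteristics that are only partly related to scores.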

As mentioned in the Background, the
nonresponse adjustment cells were based on
median income, percent minority, and urbanicity.
Therefore, any nonresponse bias, if it exists, must
be the result of some other variable being
correlated with nonresponse, or, looked at another
way, there must be a variable within the median
income by percent minority by urbanicity
nonresponse adjustment cells that exhibits one
level for the respondents and another for the
nonrespondents.

To put the results in context, the following
are the estimates and standard errors from the
1990 NAEP Trial State Assessment of the NAEP
average scores for Georgia, Colorado, and
Connecticut:

State          Estimate    Standard Error

Georgia          258            1.3
Colorado         267            1.0
Connecticut      270            1.1

As mentioned earlier, this jackknife variance 
includes the variation due to nonresponse and 
nonresponse adjustment. 

Initially, schools were eliminated from the
states completely at random. Schools were
formed into nonresponse adjustment classes and
any necessary collapsing was performed. Then
nonresponse adjustment factors were applied to
the responding schools and the average score was
calculated. While this method did not tell us
anything about the bias, it did provide information
about variation associated with nonresponse and
subsequent nonresponse adjustment.



Table 1: Georgia simulation with 3 rates
of nonresponse and 30 observations.

                 Nonresponse
Obs.       5%         10%        20%

  1     258.300    258.201    257.502
  2     258.648    258.218    259.865
  3     258.082    258.333    257.578
  4     258.334    258.769    258.747
  5     258.498    258.505    258.566
  6     258.531    258.075    257.470
  7     258.702    259.116    257.733
  8     258.545    257.612    258.874
  9     258.301    257.938    259.621
 10     258.548    258.253    258.529
 11     258.246    258.832    258.514
 12     258.787    257.465    258.034
 13     257.865    258.670    258.771
 14     258.646    257.965    257.890
 15     258.184    258.337    258.043
 16     258.137    257.821    257.660
 17     258.301    258.614    256.776
 18     258.603    258.372    258.445
 19     258.576    258.555    258.067
 20     258.251    257.521    258.243
 21     258.446    258.163    256.290
 22     257.949    257.571    259.379
 23     258.569    258.580    258.533
 24     257.609    256.885    258.024
 25     258.268    257.936    259.216
 26     258.629    258.105    258.754
 27     258.105    258.214    259.214
 28     257.898    257.605    258.847
 29     258.630    258.286    257.001
 30     258.314    257.977    257.119

Table 2: Bias and standard deviation with simulated nonresponse for Georgia and Colorado.

                         Georgia             Colorado
                      bias     s.d.       bias     s.d.

 5% Nonresponse      .1088    .2842      .0637    .1692
10% Nonresponse      .1014    .4693      .0699    .3160
20% Nonresponse      .0012    .8489      .0935    .4902



We calculated 30 simulations for each of the
states at 5%, 10%, and 20% nonresponse rates
(see Table 1 for the state of Georgia).

Note: For readers familiar with NAEP, we
mention that these results are for the first
"plausible value," but similar results are available
for the other four plausible values.

Observe that the simulated scores are quite
close to the true value 258. (In fact, if we did
enough simulations, the average of the composite
scores would converge to the true value.)

In Table 2, for each simulation score, the true
score has been subtracted, the difference squared,
summed across the 30 simulations, and divided by
30. The first thing to note is that the variance
increases as the percent nonresponse increases
from 5% to 20%. This is what we would expect,
because there is greater variability with more
nonresponse.
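
The calculation described for Table 2 can be sketched as follows. The function subtracts the true score from each simulated score, averages the differences to get a bias, and averages the squared differences to get a mean squared deviation, whose square root is reported here as the spread. As an assumption for illustration, the true score is taken to be the rounded published value of 258.0 and only the first five 5% entries for Georgia from Table 1 are used, so the output is not expected to reproduce the Table 2 figures.

```python
from math import sqrt

def bias_and_rms(sim_scores, true_score):
    """Return (mean deviation, root mean squared deviation) from true_score."""
    n = len(sim_scores)
    deviations = [s - true_score for s in sim_scores]
    bias = sum(deviations) / n
    mean_sq = sum(d * d for d in deviations) / n
    return bias, sqrt(mean_sq)

# First five Georgia simulations at 5% nonresponse (from Table 1).
georgia_5pct = [258.300, 258.648, 258.082, 258.334, 258.498]
ASSUMED_TRUE_SCORE = 258.0  # rounded published estimate; an assumption

bias, rms = bias_and_rms(georgia_5pct, ASSUMED_TRUE_SCORE)
print(f"bias={bias:.4f}, rms deviation={rms:.4f}")
```

Because the mean squared deviation includes the squared bias, the root mean squared deviation can never be smaller than the absolute bias.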

The second piece of information that is
somewhat instructive is the size of the simulated
variation relative to the variance (1.69) implied by
the calculated standard error for Georgia of 1.3
(based on 100% response rate
and no nonresponse adjustment). With
nonresponse, two factors increase the variance:
first, the variance is increased due to the
decreased sample size and second, it is increased
by the effect of nonresponse adjustments on
weight variability. Therefore, if we take the
standard deviation at 20% nonresponse,
approximately .8, and square
it to get .64, then we might infer that the variance
of the estimate for Georgia would be equal to
1.69 + .64 = 2.33, if the component of variance
due to nonresponse and adjustment is assumed to
be additive. Because a 20% nonresponse rate
alone would inflate the variance by a factor of 1/.8 =
1.25, the resulting variance would be
approximately 2.11.

Now, if we divide the total projected variance
2.33 by 2.11, we get a factor of 1.10, an
estimate of the increase in variance due to the
application of nonresponse adjustment factors.
This factor seems reasonable, and implies that the
nonresponse adjustments only add about 10% to
the estimated variances. (This factor could be
verified another way by calculating the jackknife
variance of one of the simulated samples.)
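
The variance arithmetic above can be retraced in a few lines of Python. The inputs are the figures quoted in the text (standard error 1.3, simulated standard deviation of about .8, nonresponse rate 20%); the function name and its parameters are hypothetical, and the additivity of the two variance components is the assumption made in the text itself.

```python
def adjustment_variance_factor(base_se, sim_sd, nonresponse_rate):
    """Return (base_var, total_var, size_only_var, factor).

    total_var assumes the nonresponse component adds to the base variance;
    size_only_var inflates the base variance for the smaller sample alone.
    """
    base_var = base_se ** 2
    total_var = base_var + sim_sd ** 2
    size_only_var = base_var / (1 - nonresponse_rate)
    return base_var, total_var, size_only_var, total_var / size_only_var

base_var, total_var, size_only_var, factor = adjustment_variance_factor(
    base_se=1.3, sim_sd=0.8, nonresponse_rate=0.20
)
print(f"{base_var:.2f} {total_var:.2f} {size_only_var:.2f} {factor:.2f}")
```

Using the unrounded Table 2 value of .8489 in place of .8 would give a somewhat larger factor, so the 10% figure is best read as an order-of-magnitude estimate.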

IV. Future Work 

With respect to simulations based on state
assessment tests, further research is desirable to
see if there are other states that have tests
comparable to NAEP conducted on all schools.

With respect to simulating the impact of
school nonresponse on states with 100% response,
one could base the simulated patterns of
nonresponse on the patterns actually observed in
states with less than 100% response. In 1992 one
of the participating states had a nonresponse rate
of 38%. One could simulate the impact of
various levels of nonresponse between 5% and
45% and try to develop a sense of when the
nonresponse becomes so large as to have a
significant impact on the estimated results.